INTRODUCTION:

Expression  of  an  inherited  disease  gene  does  not  always  follow  a  Mendelian 
inheritance  pattern.  Since  the  turn  of  the  century,  it  has  been  observed  that  the 
transmission  of  certain  disease  genes  from  one  generation  to  the  next  is  associated 
with  an  increase  in  the  severity  of  the  disease  symptoms  and/or  a  decrease  in  the  age 
of onset. This clinical phenomenon is called anticipation, and it has been observed in many
inherited diseases, especially neurological disorders. However, the
absence  of  a  satisfactory  physical  explanation  for  anticipation  at  the  molecular  level 
resulted  in  much  controversy  in  the  past  over  the  acceptance  of  the  concept  of  genetic 
anticipation. 

Advances in the molecular genetic analysis of many of the inherited human
diseases in which anticipation has been observed have recently attributed anticipation in
these diseases to a new class of dynamic mutations, characterized by trinucleotide
repeat expansions at the loci to which the disease genes have been mapped. The effect
depends on the location of the repeats relative to the gene and on the type of repeat (for
review see Sanjeeva et al., 1997; Margolis et al., 1999; Vincent et al., 2000). The
molecular consequences of trinucleotide repeat expansions and the
mechanisms by which they lead to pathology may be quite diverse. In general, however,
trinucleotide repeat expansions have been shown to perturb either the structure and
function (type I mutations) or the expression (type II mutations) of the affected gene
(Sanjeeva et al., 1997; Margolis et al., 1999; Vincent et al., 2000). For
example, expansion of a (CAG)n repeat in the coding region of the Huntington’s
disease gene allows expression of an altered protein that contains an expanded
polyglutamine region, resulting in altered conformation, processing, and general physical
properties of the protein product (Trottier et al., 1995). Alternatively,
an expansion that results in 200-2000 CGG repeats in the 5’ untranslated region of the
Fragile X syndrome (FRAXA) gene results in the loss of expression of FRAXA mRNA
(Pieretti et al., 1991). As the repeat expands with transmission to the next generation,
the CGG repeats become more heavily methylated, reducing transcription of the FRAXA gene.
The  nature  of  this  dynamic  group  of  mutations,  which  can  involve  very  large 
amplification  of  trinucleotide  repeats,  renders  the  sequence  unstable  during  meiosis. 
This  results  in  intergenerational  instability  of  the  length  of  the  trinucleotide  repeat.  It  is 
not  known  why  repeats  which  exceed  a  critical  value  are  unstably  transmitted  to 
succeeding  generations  with  a  tendency  towards  expansion  of  trinucleotide  repeats. 
There is, however, a very clear association between longer expansions at the disease
locus in succeeding generations and earlier clinical manifestation (Sanjeeva et al.,
1997; Margolis et al., 1999; Vincent et al., 2000). Anticipation, therefore, is now
commonly  accepted  as  a  hallmark  of  the  inheritance  of  an  amplified  trinucleotide  repeat 
expansion  mutation.  Thus  far,  at  least  12  genetic  diseases  (mostly  muscular  or 
neurological  disorders)  have  been  attributed  to  expansions  of  trinucleotide  repeats  in 
the  loci  containing  the  disease  gene.  These  diseases  are  characterized  as  having 
increasing  copy  numbers  of  the  unstable  expanded  sequences  with  subsequent 
generations. 

One can envision that expansions of trinucleotide repeats are not restricted to
neuromuscular disorders and that they probably represent a novel class of dynamic
mutations causing various human diseases. A study by O’Donovan et al. (1996)
provided an intriguing finding suggesting that repeat expansions have a widespread
role in common human diseases. This study showed that older, healthy individuals
generally have shorter CAG-repeat lengths than their younger counterparts. The
demonstration of decreased genome-wide repeat copy number with age in healthy
populations suggests that dynamic mutations could have a widespread role in
human susceptibility to disease. These findings also suggest that trinucleotide repeat
diseases do not display single major gene inheritance.


Statement  of  Work 


Technical  Objective  1:  Cloning  of  Gene  Sequences  with  CAG-Repeat  Expansion. 

(Four  parallel  cloning  procedures  will  be  performed  simultaneously  using  4  pools  each 
including  DNA  mixed  from  2  RED  positive  cases) 


Task 1: Months 1-4  Enrichment by digestion of DNA samples and separation on
agarose gels. Slicing (2-4 mm) and extracting the DNA from
each slice (approximately 200-400 slices). RED analysis of
all 200-400 slices to locate the fragments with expansion.

Task 2: Months 4-6  Cloning into the ZAP II system, electroporation into E. coli, and
amplification. Secondary enrichment by pooling: plating the
bacteria, pooling, and amplification of pools. Extraction of
DNA (~100-200) and RED analysis (~100-200).

Task 3: Months 6-8  Transformation of the RED positive DNA into the E. coli CJ236
strain and generation of ssDNA. Isolation of the ssDNA and
production of dsDNA containing only CAG inserts using
CAG-probes and the primer extension method. Electroporation
into E. coli, amplification, and plating. Selection and
amplification of individual clones. Extraction of DNA from
individual clones (~100-200).

Task 4: Months 8-10  RED analysis on the DNA extracted from the individual
clones (~100-200). Sequencing of inserts from RED positive
clones. Identification of clones with large repeat sequences.

Task 5: Months 10-11  Designing PCR primers for every sequence with large
CAG-repeats identified through cloning. Screening of all the cases
originally detected to have the CAG-repeat expansion
(detected by RED) by PCR and sequencing.


Technical Objective 2: Identification and Characterization of Gene(s) Containing or Flanking Expanded CAG-Repeats

Task 1: Months 11-12  Designing PCR protocols for all the sequences identified to
have large CAG-repeats through cloning, and optimization of
their detection through microsatellite analysis. Sequencing of
a small panel of control specimens to identify repeats of
varying sizes to be used as size controls on microsatellite
gels.

Task 2: Months 12-17  Microsatellite analysis of 100 control specimens for every
repeat cloned to determine the allelic frequency of each
repeat. It is expected that several regions containing large
CAG-repeats will be identified from cloning of RED positive
DNA from 8 breast cancer cases.

Task 3: Months 17-22  Extraction of sequence data from different sequence
database sources, including GenBank, EST, and STS
databases, and assembly of the pieces of information to
obtain longer sequences. Mapping of the CAG
repeats using public genetic map information, and identification of
other genes located around the CAG-repeat. Locating the
exact position of the CAG repeat in relation to the structure of
the genes identified.

Task 4: Months 20-24  Obtaining the sequence data from the databases and
interpreting this information in light of an
extensive literature search. If necessary, wet lab work will be
carried out to clarify the position of the CAG-repeat in relation to
the structure of the gene(s) identified. Search for information
published on the loci/genes found and their association with
cancer, specifically breast cancer. Writing of manuscripts
to be submitted to peer-reviewed journals.




Body 

1. Overview

Our goal is to clone and characterize sequences containing expanded CAG repeats
in breast cancer patients. Genes at these loci may represent breast cancer
predisposition genes. Most of the reported methods for library construction and
cloning of fragments containing large repeats require a large amount of
genomic DNA as starting template (Yuan et al., 2001; Vincent et al., 2000; Koob et al.,
1998). Since patient material is a limited resource, we were specifically interested in
developing an efficient cloning strategy that enriches for fragments containing long CAG
repeats with minimal use of starting genomic DNA. Long trinucleotide repeats
tend to form special secondary structures, which reduce the efficiency of cloning of
DNA fragments containing expanded repeats. Cloning of fragments containing long
repeats is therefore cumbersome (Koob et al., 1998; Sanpei et al., 1996).

2.  Establishment  of  the  Cloning  Procedure  for  Repeat  Containing  Sequences 

2A.  Enrichment  for  Long  CAG-Repeat  Containing  Fragments 

A genomic DNA sample containing a large CAG repeat was digested with Sau3AI.
Adaptors complementary to Sau3AI cut sites were ligated to the digested product. The
ligated product was used as template for a PCR reaction using one of the adaptors as
primer. The PCR reaction resulted in a smear of amplified fragments that ranged in
length from 100 bp to more than 3 kb (Figure 1). Several PCR conditions were
tested, and the products were carried through the further stages of the procedure.
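The digestion step can be illustrated in silico. The following is a minimal sketch, not part of the reported procedure: it assumes a linear input sequence and uses the known Sau3AI recognition site GATC, which the enzyme cleaves immediately 5' of the G. The function name and the example sequence are hypothetical.

```python
def sau3ai_fragments(seq: str) -> list[str]:
    """Split a linear DNA sequence at every Sau3AI site (cut 5' of GATC)."""
    seq = seq.upper()
    site = "GATC"
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos)            # the cut falls just before the G of GATC
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop the empty leading fragment that appears when the sequence starts with GATC
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical example sequence with two Sau3AI sites
example = "AAGATCTTTGATCCC"
fragments = sau3ai_fragments(example)
lengths = [len(f) for f in fragments]
```

Every fragment except possibly the first begins with GATC, which corresponds to the overhang the adaptors are ligated to. The fragment lengths give a rough idea of the size distribution expected before PCR.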

The PCR product was purified, a biotin-labeled (CAG)s oligo was annealed to
the product, and a nucleotide-biased extension step was performed using dATP, dCTP,
and dGTP. The extension reaction product was mixed with Streptavidin MagneSphere
Paramagnetic Particles, and the single-stranded DNA fragments were then eluted and purified
by ethanol precipitation. PCR was performed using the eluted fragments as template
and the same adaptor as primer. The PCR product was then digested with Sau3AI,
new adaptors were ligated to the digested product, and the whole enrichment process
was repeated (Figure 2). The final PCR product was cloned into a T vector using the
pGEM-T Easy Vector System II kit (Promega) and the XL1-Blue MRF’ bacterial strain.

This strain was chosen because it tolerates vectors containing
fragments with long repetitive DNA. White colonies were picked and cultured for 6 hours
in 96-well plates containing 200 µl of LB-ampicillin medium. PCR was performed using
the cultured bacteria directly as template and the T7/SP6 primer set (Figure 3). We
noticed that most of the PCR products were very short, which indicates that the cloning
process favored short fragments.

2B.  Enrichment  by  Gel  Fragmentation 

Because the small PCR products in the sample outcompeted larger fragments during
cloning, we developed a gel fragmentation step before cloning in which shorter
fragments (<300 bp) were removed. The final PCR
product was run on a 1% agarose gel and the gel was divided into 3 pieces: short
fragments (<300 bp), medium fragments (300-2000 bp), and long fragments (>3000 bp)
(Figure 4).


Figure 1. PCR product of digested DNA ligated to adaptors


The  DNA  was  recovered  from  the  short,  medium  and  long  portions  of  the  gel  and 
cloned  into  the  T  vector.  Cloned  fragments  were  amplified  by  using  the  vector  specific 
primers  (SP6/T7)  (Figure  5). 

2C.  PCR-Based  Colony  Screening 

Initially, we sequenced multiple colonies to assess the type of fragments
incorporated into the vectors. To increase the efficiency of screening inserts, we
developed a PCR-based screening method. Individual or pooled colonies can
be lysed and directly amplified. This is an efficient way to screen for colonies that
contain large repeats while eliminating colonies that contain short inserts. This method will
be used to enrich for colonies with larger inserts (potentially the large repeats), which
will then be subjected to direct sequencing for validation.
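The selection logic of such a screen can be sketched as follows. This is an illustrative sketch only: the function name, the colony labels, and the `flank_bp` parameter (the vector sequence amplified between the T7 and SP6 primers and the insert) are assumptions, and the flank length must be supplied for the actual vector. The default 300 bp cutoff mirrors the short-fragment threshold used in the gel fragmentation step.

```python
def select_colonies(amplicon_bp: dict[str, int],
                    flank_bp: int,
                    min_insert_bp: int = 300) -> list[str]:
    """Return colonies whose estimated insert size meets the cutoff.

    amplicon_bp   colony label -> colony PCR product size read from the gel (bp)
    flank_bp      vector sequence included in the amplicon (assumed, vector specific)
    min_insert_bp smallest insert worth sequencing
    """
    keep = []
    for colony, size in amplicon_bp.items():
        insert = size - flank_bp      # estimated insert length
        if insert >= min_insert_bp:
            keep.append(colony)
    return keep


# Hypothetical gel readings for three wells and a hypothetical 180 bp flank
readings = {"A1": 250, "A2": 900, "A3": 480}
to_sequence = select_colonies(readings, flank_bp=180)
```

Only colonies returned by the filter would go on to direct sequencing. Pooled colonies could be handled the same way by recording the largest band in each pool.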

2D.  Enhancement  of  Percentage  of  CAG  Repeat  Containing  Fragments 

DNA sequencing was performed on 24 PCR products resulting from cloning of
fragments after one round of enrichment and on 32 PCR products resulting from cloning
of fragments after two rounds of enrichment. As shown in Table 1, repeating the
enrichment process for another round prior to cloning enhanced the percentage of
CAG containing fragments from 17% to 66%. Most of the detected CAG repeats
ranged in size between (CAG)4 and (CAG)12. We were able to detect two repeats larger
than (CAG)12: one with (CAG)13 and one with (CAG)17.
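Repeat sizes such as these can be read from clone sequences automatically. The sketch below is one possible approach, not the method used in this study: it reports n for the longest uninterrupted (CAG)n tract, also checking CTG (the reverse complement of CAG) because an insert may be sequenced from either strand. The function name and test sequences are hypothetical.

```python
import re


def longest_cag_run(seq: str) -> int:
    """Return n for the longest uninterrupted (CAG)n or (CTG)n tract in seq."""
    seq = seq.upper()
    best = 0
    for unit in ("CAG", "CTG"):       # CTG covers inserts read on the opposite strand
        for match in re.finditer(f"(?:{unit})+", seq):
            best = max(best, len(match.group(0)) // 3)
    return best


# Hypothetical insert carrying a 13-unit tract with short flanks
insert = "TT" + "CAG" * 13 + "GA"
n_units = longest_cag_run(insert)
is_above_12 = n_units > 12
```

Interrupted tracts are counted as separate runs, so a repeat broken by a single variant triplet is reported by its longest pure segment.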

[Schematic; surviving labels: PCR amplification, followed by library construction]

Figure 2. Schematic representation of enrichment for long CAG repeat containing
fragments using magnetic beads.


2E.  Library  Construction  and  Screening 

We are now in the process of library construction and screening using the
above-mentioned enrichment and cloning strategy. Using this approach, we will be able to
identify the location and the flanking sequences of the large CAG repeats that were
detected by RED in our previous project.

Figure 3. PCR products of amplified plasmids using T7/SP6 primers

Figure 4. Enrichment by gel fragmentation (gel regions labeled long, medium, and
short fragments; size markers at 3000 bp and 300 bp)

Figure 5. PCR products of amplified plasmids using T7/SP6 primers after gel
fragmentation


                        (CAG)n        No (CAG)n               % of (CAG)n
                        containing    containing              containing
                        clones        clones        Total     clones

One round enrichment    4             20            24        17
Two rounds enrichment   21            11            32        66

Table 1. Enhancement of the percentage of CAG containing clones by enrichment for
repeat containing fragments
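The percentages in Table 1 follow directly from the clone counts. The short check below recomputes them; the function and variable names are illustrative only.

```python
def percent_positive(positive: int, total: int) -> int:
    """Percentage of (CAG)n containing clones, rounded to the nearest integer."""
    return round(100 * positive / total)


# Clone counts from Table 1: (positive, negative, total)
one_round = (4, 20, 24)
two_rounds = (21, 11, 32)

# Positive and negative clones should add up to the total in each row
rows_consistent = all(pos + neg == tot for pos, neg, tot in (one_round, two_rounds))

pct_one = percent_positive(one_round[0], one_round[2])      # 4/24
pct_two = percent_positive(two_rounds[0], two_rounds[2])    # 21/32

# Fold increase in the fraction of positive clones after the second round
fold = (two_rounds[0] / two_rounds[2]) / (one_round[0] / one_round[2])
```

The counts give 17% and 66%, matching the table, and the second round raises the fraction of (CAG)n containing clones by a factor of about 3.9.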
3. Immediate Future Task

Using the above optimized cloning strategy, we will complete the screening for RED
positive clones in order to identify expanded repeats and their flanking sequences. If
we are unable to detect any clones containing long repeats, the system will need to
be retested for its efficiency in cloning fragments containing long CAG repeats. If necessary,
we will seek collaboration with a well-established laboratory to complete the cloning
process.

4.  Key  Research  Accomplishments 

At  this  stage,  we  have  developed  a  cloning  strategy  by  integrating  several  enrichment 
steps  for  efficient  identification  of  large  repeat  containing  fragments.  The  specific 
technical  accomplishments  include: 

(a)  Enrichment  for  CAG  repeat  containing  fragments  using  magnetic  beads 

(b)  Enrichment  by  gel  fragmentation 

(c)  Enhancement  of  the  percentage  of  CAG  repeat  containing  clones 

(d)  Development  of  a  PCR  based  colony-screening  method 

We  have  also  carried  out  a  validation  study  to  assess  the  efficiency  of  our  strategy  to 
detect  large  repeat  containing  fragments. 


5.  Reportable  Outcomes 